Low Rank and Sparse Fourier Structure in Recurrent Networks Trained on Modular Addition

arXiv.org Machine Learning

Akshay Rangamani, Dept. of Data Science, New Jersey Institute of Technology, Newark, NJ, USA (akshay.rangamani@njit.edu)

Abstract: Modular addition tasks serve as a useful test bed for observing empirical phenomena in deep learning, including the phenomenon of grokking. Prior work has shown that one-layer transformer architectures learn Fourier Multiplication circuits to solve modular addition tasks. In this paper, we show that Recurrent Neural Networks (RNNs) trained on modular addition tasks also use a Fourier Multiplication strategy. We identify low rank structures in the model weights, and attribute model components to specific Fourier frequencies, resulting in a sparse representation in the Fourier space. We also show empirically that the RNN is robust to removing individual frequencies, while the performance degrades drastically as more frequencies are ablated from the model.
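
To make the Fourier Multiplication strategy concrete, here is a minimal NumPy sketch of the circuit as described for transformers in prior work: each input is encoded as cosine and sine features at a few key frequencies, trigonometric identities combine the two operands, and the output logits peak at (a + b) mod p. The frequency set below is illustrative, not one recovered from a trained RNN.

```python
import numpy as np

p = 97                 # modulus (prime)
freqs = [3, 8, 14]     # illustrative sparse set of key frequencies

def fourier_mod_add(a, b):
    """Compute (a + b) mod p with the Fourier Multiplication circuit:
    combine per-input cos/sin features via trig identities, then read
    out the c that maximizes sum_k cos(w_k * (a + b - c))."""
    c = np.arange(p)
    logits = np.zeros(p)
    for k in freqs:
        w = 2 * np.pi * k / p
        # cos/sin of w*(a+b), built from single-input features only
        cos_ab = np.cos(w * a) * np.cos(w * b) - np.sin(w * a) * np.sin(w * b)
        sin_ab = np.sin(w * a) * np.cos(w * b) + np.cos(w * a) * np.sin(w * b)
        # cos(w*(a+b-c)) attains its maximum 1 exactly at c = (a+b) mod p
        logits += cos_ab * np.cos(w * c) + sin_ab * np.sin(w * c)
    return int(np.argmax(logits))

assert all(fourier_mod_add(a, b) == (a + b) % p
           for a in range(p) for b in range(p))
```

With p prime, a single frequency coprime to p already suffices to decode the answer; redundancy across several frequencies is consistent with the reported robustness to ablating any one of them.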


Linguistic Knowledge Transfer Learning for Speech Enhancement

arXiv.org Artificial Intelligence

Linguistic knowledge plays a crucial role in spoken language comprehension. It provides essential semantic and syntactic context for speech perception in noisy environments. However, most speech enhancement (SE) methods predominantly rely on acoustic features to learn the mapping relationship between noisy and clean speech, with limited exploration of linguistic integration. While text-informed SE approaches have been investigated, they often require explicit speech-text alignment or externally provided textual data, constraining their practicality in real-world scenarios. Additionally, using text as input poses challenges in aligning linguistic and acoustic representations due to their inherent differences. In this study, we propose the Cross-Modality Knowledge Transfer (CMKT) learning framework, which leverages pre-trained large language models (LLMs) to infuse linguistic knowledge into SE models without requiring text input or LLMs during inference. Furthermore, we introduce a misalignment strategy to improve knowledge transfer. This strategy applies controlled temporal shifts, encouraging the model to learn more robust representations. Experimental evaluations demonstrate that CMKT consistently outperforms baseline models across various SE architectures and LLM embeddings, highlighting its adaptability to different configurations. Additionally, results on Mandarin and English datasets confirm its effectiveness across diverse linguistic conditions, further validating its robustness. Moreover, CMKT remains effective even in scenarios without textual data, underscoring its practicality for real-world applications. By bridging the gap between linguistic and acoustic modalities, CMKT offers a scalable and innovative solution for integrating linguistic knowledge into SE models, leading to substantial improvements in both intelligibility and enhancement performance.
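
As a rough illustration of how such a transfer objective might look, the sketch below aligns intermediate SE features with frozen LLM embeddings of the clean transcript through a learned projection, applying a random temporal shift in the spirit of the misalignment strategy. The module name, loss form, and shift range are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CMKTLoss(nn.Module):
    """Illustrative knowledge-transfer loss: project SE features into the
    LLM embedding space and match them, under a random temporal shift,
    to frozen LLM embeddings of the clean transcript (training only)."""
    def __init__(self, se_dim, llm_dim, max_shift=4):
        super().__init__()
        self.proj = nn.Linear(se_dim, llm_dim)  # trained jointly with the SE model
        self.max_shift = max_shift

    def forward(self, se_feats, llm_embs):
        # se_feats: (B, T, se_dim); llm_embs: (B, T, llm_dim), precomputed
        shift = int(torch.randint(-self.max_shift, self.max_shift + 1, (1,)))
        target = torch.roll(llm_embs, shifts=shift, dims=1)  # controlled misalignment
        pred = self.proj(se_feats)
        # 1 - cosine similarity, averaged over batch and time
        return 1.0 - F.cosine_similarity(pred, target, dim=-1).mean()
```

Because the projection head and the LLM embeddings appear only in the loss, inference requires neither text nor the LLM, matching the framework's stated deployment setting.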


Learning to Measure Quantum Neural Networks

arXiv.org Artificial Intelligence

The rapid progress in quantum computing (QC) and machine learning (ML) has attracted growing attention, prompting extensive research into quantum machine learning (QML) algorithms to solve diverse and complex problems. Designing high-performance QML models demands expert-level proficiency, which remains a significant obstacle to the broader adoption of QML. Major hurdles include crafting effective data encoding techniques and parameterized quantum circuits, both of which are crucial to the performance of QML models. Additionally, the measurement phase is frequently overlooked: most current QML models rely on pre-defined measurement protocols that often fail to account for the specific problem being addressed. We introduce a novel approach that makes the observable of the quantum system (specifically, the Hermitian matrix) learnable. Our method features an end-to-end differentiable learning framework, where the parameterized observable is trained alongside the ordinary quantum circuit parameters. Using numerical simulations, we show that the proposed method can identify observables for variational quantum circuits that lead to improved outcomes, such as higher classification accuracy, thereby boosting the overall performance of QML models.
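
One simple way to make an observable learnable is sketched below, assuming a state-vector simulator: any complex matrix A yields a Hermitian H = A + A†, so the entries of A can be trained by gradient descent alongside the circuit parameters. The parameterization is illustrative; the paper's exact construction may differ.

```python
import torch

class LearnableObservable(torch.nn.Module):
    """Trainable Hermitian observable: H = A + A^dagger is Hermitian for
    any complex A, so A's entries can be ordinary parameters optimized
    jointly with the circuit parameters."""
    def __init__(self, dim):
        super().__init__()
        self.re = torch.nn.Parameter(0.1 * torch.randn(dim, dim))
        self.im = torch.nn.Parameter(0.1 * torch.randn(dim, dim))

    def forward(self, psi):
        a = torch.complex(self.re, self.im)
        h = a + a.conj().T                       # Hermitian by construction
        # Expectation value <psi|H|psi>, real for Hermitian H
        return torch.real(torch.conj(psi) @ (h @ psi))

# e.g. a 2-qubit state as produced by a simulated variational circuit
psi = torch.randn(4, dtype=torch.complex64)
psi = psi / torch.linalg.norm(psi)
obs = LearnableObservable(4)
obs(psi).backward()   # gradients flow to the observable's entries
```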


Transfer Learning Analysis of Variational Quantum Circuits

arXiv.org Artificial Intelligence

This work analyzes transfer learning of Variational Quantum Circuits (VQCs). Our framework begins with a VQC pretrained in one domain and calculates the transition of 1-parameter unitary subgroups required for a new domain. A formalism is established to investigate the adaptability and capability of a VQC through an analysis of loss bounds. Our theory characterizes knowledge transfer in VQCs and provides a heuristic interpretation of the mechanism. An analytical fine-tuning method is derived to attain the optimal transition for adapting to similar domains.
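
For readers unfamiliar with the central object: a 1-parameter unitary subgroup is a family U(θ) = exp(-iθG) for a fixed Hermitian generator G, satisfying U(a)U(b) = U(a + b). The sketch below, using an arbitrary Pauli generator, illustrates the kind of transition between a source parameter and a target parameter that the analysis concerns; it is not the paper's formalism.

```python
import numpy as np
from scipy.linalg import expm

# One-parameter unitary subgroup U(theta) = exp(-i * theta * G),
# generated here by the Pauli-Y rotation generator G = Y / 2.
Y = np.array([[0.0, -1.0j], [1.0j, 0.0]])
G = Y / 2

def U(theta):
    return expm(-1.0j * theta * G)

# A circuit pretrained at theta_src adapts to a new domain by moving
# along the subgroup: U(theta_src) -> U(theta_src + delta). For a
# single subgroup the required transition is delta = theta_tgt - theta_src.
theta_src, theta_tgt = 0.7, 1.9
delta = theta_tgt - theta_src
np.testing.assert_allclose(U(theta_src) @ U(delta), U(theta_tgt), atol=1e-12)
```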


Let's Think Var-by-Var: Large Language Models Enable Ad Hoc Probabilistic Reasoning

arXiv.org Artificial Intelligence

A hallmark of intelligence is the ability to flesh out underspecified situations using "common sense." We propose to extract that common sense from large language models (LLMs), in a form that can feed into probabilistic inference. We focus our investigation on $\textit{guesstimation}$ questions such as "How much are Airbnb listings in Newark, NJ?" Formulating a sensible answer without access to data requires drawing on, and integrating, bits of common knowledge about how $\texttt{Price}$ and $\texttt{Location}$ may relate to other variables, such as $\texttt{Property Type}$. Our framework answers such a question by synthesizing an $\textit{ad hoc}$ probabilistic model. First we prompt an LLM to propose a set of random variables relevant to the question, followed by moment constraints on their joint distribution. We then optimize the joint distribution $p$ within a log-linear family to maximize the overall constraint satisfaction. Our experiments show that LLMs can successfully be prompted to propose reasonable variables, and while the proposed numerical constraints can be noisy, jointly optimizing for their satisfaction reconciles them. When evaluated on probabilistic questions derived from three real-world tabular datasets, we find that our framework performs comparably to a direct prompting baseline in terms of total variation distance from the dataset distribution, and is similarly robust to noise.
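
The optimization stage can be sketched concretely. Below, two made-up variables with small discrete domains stand in for LLM proposals, and a log-linear distribution over their joint domain is fit by minimizing the squared violation of the elicited moment constraints. The variables, features, target values, and squared-error objective are all illustrative assumptions, not the paper's exact setup.

```python
import torch

# Two made-up variables with small discrete domains (stand-ins for what
# the LLM would propose): Location in {Newark, NYC}, Price in four bins.
prices = torch.tensor([50., 100., 200., 400.])
grid = [(loc, i) for loc in (0, 1) for i in range(len(prices))]  # joint domain

# Feature functions f_k(x) whose expectations the constraints pin down.
def features(loc, i):
    return torch.stack([
        prices[i],                    # constrains E[Price]
        prices[i] * loc,              # constrains E[Price * 1{NYC}]
        torch.tensor(float(loc)),     # constrains P(NYC)
    ])

feat = torch.stack([features(loc, i) for loc, i in grid])  # (|domain|, K)
targets = torch.tensor([150., 110., 0.5])  # noisy LLM-elicited moments

lam = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([lam], lr=0.05)
for _ in range(2000):
    opt.zero_grad()
    p = torch.softmax(feat @ lam, dim=0)            # log-linear family
    loss = ((feat.T @ p - targets) ** 2).mean()     # constraint violation
    loss.backward()
    opt.step()
```

Jointly optimizing all constraints is what reconciles noisy individual moments: no single target has to be matched exactly for the fitted distribution to be sensible overall.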


Fox News AI Newsletter: Jobs AI can't take

FOX News

Mehmet Aytekin, 28, left, checks his cell phone while waiting to board his United Airlines flight to Newark, N.J., at O'Hare International Airport on Jan. 3, 2020. Amid the high costs and controversies surrounding college education, coupled with the threat that artificial intelligence poses to certain white-collar jobs, much of Gen Z is leaning toward pursuing trade schools and blue-collar jobs with that tech gap in mind. IN ITS 'PRIME': Amazon.com reported record first-quarter sales as the AI boom powered growth in its cloud-computing unit, helping the company continue to shake off last year's post-pandemic slump. FUTURE'S NOT SET: Policymakers should not reference or rely on fictional scenarios as reasons to regulate AI. Otherwise, America risks losing its global lead on AI, and American citizens could never realize the full benefits of the technology.


Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness

arXiv.org Artificial Intelligence

Are the longstanding robustness issues in NLP resolved by today's larger and more performant models? To address this question, we conduct a thorough investigation using 19 models of different sizes spanning different architectural choices and pretraining objectives. We conduct evaluations using (a) OOD and challenge test sets, (b) CheckLists, (c) contrast sets, and (d) adversarial inputs. Our analysis reveals that not all OOD tests provide further insight into robustness. Evaluating with CheckLists and contrast sets shows significant gaps in model performance; merely scaling models does not make them sufficiently robust. Finally, we point out that current approaches for adversarial evaluations of models are themselves problematic: they can be easily thwarted, and in their current forms, do not represent a sufficiently deep probe of model robustness. We conclude that not only is the question of robustness in NLP as yet unresolved, but even some of the approaches to measure robustness need to be reassessed.


Conversation Style Transfer using Few-Shot Learning

arXiv.org Artificial Intelligence

Conventional text style transfer approaches focus on sentence-level style transfer without considering contextual information, and the style is described with attributes (e.g., formality). When applying style transfer in conversations such as task-oriented dialogues, existing approaches suffer from these limitations as context can play an important role and the style attributes are often difficult to define in conversations. In this paper, we introduce conversation style transfer as a few-shot learning problem, where the model learns to perform style transfer by observing only a few example dialogues in the target style. We propose a novel in-context learning approach to solve the task with style-free dialogues as a pivot. Human evaluation shows that by incorporating multi-turn context, the model is able to match the target style while having better appropriateness and semantic correctness compared to utterance/sentence-level style transfer. Additionally, we show that conversation style transfer can also benefit downstream tasks. For example, in multi-domain intent classification tasks, the F1 scores improve after transferring the style of training data to match the style of the test data.
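
A minimal sketch of the pivot idea, with illustrative prompt wording and a generic `llm` completion function (both assumptions, not the paper's prompts): the source dialogue is first rewritten into a style-free form, then restyled few-shot against example dialogues in the target style.

```python
def pivot_style_transfer(llm, source_dialogue, target_examples):
    """Two-step in-context style transfer with a style-free pivot.
    `llm` is any text-completion function; prompt wording is illustrative."""
    # Step 1: strip style while keeping content (the style-free pivot).
    neutral = llm(
        "Rewrite the following dialogue in plain, neutral language, "
        "preserving all content:\n" + source_dialogue
    )
    # Step 2: few-shot restyling conditioned on target-style examples.
    shots = "\n\n".join(target_examples)
    return llm(
        "Here are example dialogues in the target style:\n\n" + shots +
        "\n\nRewrite this dialogue in the same style, keeping its content:\n"
        + neutral
    )
```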


Lead ETL Data Engineer at Verisk - Newark, NJ, United States

#artificialintelligence

We help the world see new possibilities and inspire change for better tomorrows. Our analytic solutions bridge content, data, and analytics to help business, people, and society become stronger, more resilient, and sustainable. The Data Engineering and Analytics Lab (DEAL) is a team of technical actuaries responsible for the design and implementation of our core statistical data systems, including data ingestion, data integration, data transformation, data analysis, and analytic dataset construction. We're an innovation group charged with visualizing the future of our organization's operations and leveraging our expertise in data, technology, P&C insurance, and process optimization to provide a first-class analytics environment to our data-collection, data-management, actuarial, and data-analytics colleagues. The DEAL team is looking to hire an experienced Lead ETL Data Engineer, ideally with a strong combination of an analytical/innovative mindset, technical aptitude, business acumen, communication skills, and a passion for mentoring.


Too Smart to Fail

#artificialintelligence

Good afternoon, proud graduates of the University of Machine Learning class of … In retrospect, it was a bad day for Artificial Intelligence.